Abstract

A statistical distribution determined from an incomplete set of constraints is
shown to be a suitable host for encrypted information. We design an
encoding/decoding scheme to embed hidden information in such a distribution. The
security of the encryption is based on the extreme instability of the encoding
procedure. The essential feature of the proposed system is that the key for
retrieving the code is generated by random perturbations of very small value. The
security of the proposed encryption therefore relies on the secure exchange of the
secret key. Hence, it appears to be a good complement to the quantum key
distribution protocol.

PACS: 05.20.-y, 02.50.Tt, 02.30.Zz, 07.05.Kf 



1 Introduction 

Cryptography is the art of code making and cryptology the art of secure
communications. Recently, quantum mechanics has made a remarkable entry into the
field [1-3]. The most straightforward application of quantum cryptology is the
distribution of secret keys. This problem is referred to in the cryptography
literature as the key distribution problem. Classical methods for securing a secret
key are based on the assumed difficulty of computing certain functions [4,5].
Quantum encryption provides a way of agreeing on a secret key without making this
assumption. The first quantum key distribution protocol was proposed in 1984 by
Bennett and Brassard [6] and there are already rigorous proofs of its security
[7-9]. The list of recent contributions concerning the security and implementation
of quantum key distribution protocols is extensive; as a sample one should mention
[10-14].

The amount of information that can be transmitted over a quantum channel is not
very large, but by means of secret-key cryptographic algorithms a large amount of
information can be secured. In this paper we set the foundations for encryption
procedures based on statistical distributions, which will be shown to be good
complements to quantum cryptology. We demonstrate that the statistical
distribution of a physical system is a suitable host for encrypted information, and
we discuss a method for embedding encrypted messages without affecting the physical
content of the distribution.

The assumption that a physical system is well described by a particular parametric
class of statistical distributions entails, in most situations, a great deal of
prior information. Indeed, the functional form of the most celebrated statistical
distributions (Gibbs-Boltzmann, Fermi-Dirac, Bose-Einstein, etc.) can be derived
from a few constraints, expressed as mean values of some observables, and the
optimisation of a convex function called entropy or information measure [15]. Thus,
if a distribution so obtained happens to be the right one to describe a particular
system, one can think of the optimisation process as replacing the information
required to determine a unique distribution for the system.

We would like to think that, when one "guesses" (or derives) a distribution from
incomplete information, one also generates an "invisible reservoir" in which to
place information. This is not an original remark, of course, but an elementary
result of linear algebra: associated with a rank-deficient transformation there are
two spaces, the range and the null space of the transformation. The latter is an
"invisible" space in the sense that all its elements are mapped to zero by the
transformation.

In this Communication we show that the invisible reservoir is an appropriate host
for storing covert information. We present an encoding/decoding scheme that allows
a large amount of hidden information to be stored together with the distribution of
a physical system. The main idea is to make use of the null space of the
transformation generated by the constraints that the distribution must fulfil. The
security of the system is guaranteed by designing a "chaotic" encoding procedure
(in the popular meaning of the word, not the technical one). This is achieved by
means of random perturbations to an extremely sensitive encoding process. The
random perturbations, of very small value, thereby provide the key for recovering
the code. This is the most remarkable feature of our proposal: the key for
retrieving the hidden information is just a tiny number that accounts for the
perturbation used for encoding. Hence the relevance of this proposal to quantum
cryptology, and vice versa, since the security of our proposal depends on a secure
key exchange. It is appropriate to remark that the most notable difference between
this approach and chaotic cryptosystems [16, 17] is that the theory underlying our
approach is essentially a linear one.

The idea of making use of an "unstable system" for encryption has been successfully
applied to the over-sampling of Fourier coefficients, for transmitting hidden
messages while transmitting a signal [18]. Here we use equivalent ideas. We assume
that, in addition to some constraints, we know the process by which the statistical
distribution is uniquely determined. This process is usually the optimisation of a
convex function (entropy). The particular expression for the entropy may be a matter
of controversy, though. For our purposes the choice of the entropic measure is not
relevant at all; what is important is the convexity property, which ensures a unique
solution. The selection of the appropriate entropic measure is crucial, of course,
for determining the right distribution for the physical system. Nevertheless, this
has no bearing on our encryption scheme.

The paper is organised as follows. In Section 2 we introduce the notation together
with an encoding/decoding scheme for embedding hidden information in a statistical
distribution. The procedure is illustrated by a numerical simulation in Section 2.1,
and some conclusions are drawn in Section 3.

2 Embedding the statistical distribution

We restrict considerations to finite-dimensional classical statistical systems or,
equivalently, to a quantum system represented by a distribution constructed from
commuting operators. In both cases the mean values of, say, $M$ physical
observations $x_1^o, x_2^o, \ldots, x_M^o$, each of which is the expectation value
of a random variable that takes values $x_{i,n}$, $n = 1, \ldots, N$, according to a
probability distribution $p_n$, $n = 1, \ldots, N$, are expressed as:

$$x_i^o = \sum_{n=1}^{N} p_n x_{i,n}, \quad i = 1, \ldots, M, \qquad 1 = \sum_{n=1}^{N} p_n. \tag{1}$$

Usually the number $M$ of available measurements is much smaller than the dimension
$N$ of the probability space. In order to assert a unique distribution for the system
at hand one has to adopt a decision criterion, which is frequently implemented
through the maximisation of a convex measure on the probability distribution. Such
a measure, called entropy or information measure, takes different forms. Here we
simply assume that the distribution characterising a given physical system is agreed
to be determined by a fixed set of constraints of the form (1) and the optimisation
of a convex function that we denote $S$.

For the sake of a handy notation we use Dirac notation to represent vectors.
Thus, the probability distribution is represented as the ket
$|p\rangle \in \mathbb{R}^N$ which, by denoting as $|n\rangle$, $n = 1, \ldots, N$,
the standard basis in $\mathbb{R}^N$, can be expressed as

$$|p\rangle = \sum_{n=1}^{N} |n\rangle\langle n|p\rangle = \sum_{n=1}^{N} p_n |n\rangle. \tag{2}$$

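We also define a vector $|x^o\rangle \in \mathbb{R}^{M+1}$ of components
$x_1^o, x_2^o, \ldots, x_M^o, 1$ and an operator
$\hat{A} : \mathbb{R}^N \to \mathbb{R}^{M+1}$ given by

$$\hat{A} = \sum_{n=1}^{N} |x_n\rangle\langle n|. \tag{3}$$

Vectors $|x_n\rangle \in \mathbb{R}^{M+1}$, $n = 1, \ldots, N$, are defined in such a
way that $\langle i|x_n\rangle = x_{i,n}$, $i = 1, \ldots, M+1$, with
$x_{M+1,n} = 1$. Hence,

$$|x_n\rangle = \sum_{i=1}^{M+1} |i\rangle\langle i|x_n\rangle = \sum_{i=1}^{M+1} x_{i,n} |i\rangle. \tag{4}$$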

We are now in a position to join the constraints (1) together in the equation

$$|x^o\rangle = \hat{A}|p\rangle. \tag{5}$$

Since $\hat{A}$ is a rank-deficient operator, we know from elementary linear algebra
that the general solution to the under-determined system (5) can be expressed in
the form:

$$|p\rangle = \hat{A}'^{-1}|x^o\rangle + |p'\rangle, \tag{6}$$

where $\hat{A}'^{-1}$ is the pseudo-inverse of $\hat{A}$ (i.e. the inverse of the
restriction of $\hat{A}$ to the orthogonal complement of its null space) and
$|p'\rangle$ is a vector in the null space of that operator. Consequently, all the
information the probability distribution contains concerning the data is expressible
in the form $\hat{A}'^{-1}|x^o\rangle$. By contrast, the component $|p'\rangle$ is
completely independent of the data, but strongly dependent on the selection
criterion that is adopted to decide on one particular solution among the infinitely
many solutions of system (5).
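
As a small illustration of this decomposition, the following Python sketch checks that adding a null-space vector to a distribution leaves the constraint vector unchanged. It is a toy case with $N = 3$ and $M = 1$; the values of `x`, `p` and `v`, and the helper `apply_A`, are made up for the example and do not come from the numerical results of this paper.

```python
# Toy check: a null-space component does not change the constraints A|p>.
# All numbers below are illustrative assumptions, not data from the paper.

def apply_A(rows, p):
    """Apply the operator A (given as a list of rows) to the vector p."""
    return [sum(a * b for a, b in zip(row, p)) for row in rows]

x = [-1.0, 0.0, 1.0]          # values x_{1,n} of the single observable (M = 1)
A = [x, [1.0, 1.0, 1.0]]      # the last row encodes the normalisation constraint

p = [0.2, 0.5, 0.3]           # one distribution compatible with the data
v = [1.0, -2.0, 1.0]          # spans null(A): both rows of A annihilate it
p_shifted = [pn + 0.05 * vn for pn, vn in zip(p, v)]

constraints = apply_A(A, p)
constraints_shifted = apply_A(A, p_shifted)

print(constraints)            # mean value and normalisation computed from p
print(constraints_shifted)    # the same two numbers, up to round-off
```

Both vectors reproduce the same mean value and the same normalisation, so the data alone cannot tell them apart; only the selection criterion does.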

The fact that all distributions of the form
$|p\rangle = \hat{A}'^{-1}|x^o\rangle + |p'\rangle$ with
$|p'\rangle \in \mathrm{null}(\hat{A})$ are capable of reproducing the constraints
vector $|x^o\rangle$ provides us with a framework for storing encrypted information
while storing the statistical distribution of a physical system. At the encoding
step we make the following assumptions:

i) The number M + 1 of the independent linear equations which are used to 
determine the statistical distribution of a given system is fixed. The expected 
values generating the equations are assumed to be known. 

ii) The probability distribution characterising the system arises by optimisation 
of a convex function S, subjected to the M + 1 linear constraints described 
above. 

Assumptions i) and ii) entail the availability, at this stage, of the vector
$|p\rangle$ and the operator $\hat{A}$. The vectors spanning the range and null
spaces of this operator can be determined by computing the eigenvectors of the
operator $\hat{G} = \hat{A}^\dagger \hat{A}$. Let us denote as $|\eta_n\rangle$,
$n = 1, \ldots, N-(M+1)$, the normalised eigenvectors corresponding to zero
eigenvalues. We use these vectors to define the operator
$\hat{U} : \mathbb{R}^N \to \mathbb{R}^{N-(M+1)}$ as

$$\hat{U} = \sum_{n=1}^{N-M-1} |n\rangle\langle \eta_n|. \tag{7}$$

This operator is termed the decoding operator, and its adjoint, $\hat{U}^\dagger$,
the encoding operator. Using $\hat{U}^\dagger$, a basic code of $N-(M+1)$ numbers is
constructed as follows: let the $N-(M+1)$ numbers be the components
$\langle n|q\rangle = q_n$, $n = 1, \ldots, N-(M+1)$, of a vector
$|q\rangle \in \mathbb{R}^{N-(M+1)}$ and define:

$$|p_c\rangle = \hat{U}^\dagger|q\rangle = \sum_{n=1}^{N-M-1} |\eta_n\rangle\langle n|q\rangle. \tag{8}$$
Given a distribution $|p\rangle$ that can be determined from the optimisation of an
entropy measure $S$ and a set of $M+1$ constraints, the code $|q\rangle$ is embedded
in the distribution through the process below.

Encoding process

• Compute vector $|p_c\rangle$ as in (8).

• Add $|p\rangle$ and $|p_c\rangle$ to construct

$$|\tilde{p}\rangle = |p\rangle + |p_c\rangle.$$

Decoding process

• Use the vector $|\tilde{p}\rangle$ to recover the data $|x^o\rangle$ as

$$|x^o\rangle = \hat{A}|\tilde{p}\rangle.$$

• Use the data $|x^o\rangle$ to determine, by optimisation of $S$, the distribution
$|p\rangle$. From $|p\rangle$ and $|\tilde{p}\rangle$ compute vector

$$|p_c\rangle = |\tilde{p}\rangle - |p\rangle.$$

• Use the decoding operator $\hat{U}$ to obtain the encrypted code by noticing that,
since $\hat{U}\hat{U}^\dagger$ is the identity operator in $\mathbb{R}^{N-(M+1)}$,
from (8) one has:

$$|q\rangle = \hat{U}|p_c\rangle. \tag{9}$$
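
The encoding and decoding steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results reported in this paper: it takes $N = 8$, a single moment constraint plus normalisation ($M = 1$), Shannon's entropy as $S$, and builds the null-space basis by Gram-Schmidt orthogonalisation rather than from the eigenvectors of $\hat{G}$. All function names (`orthonormalise`, `null_basis`, `maxent`, `encode`, `decode`) and all numerical values are hypothetical choices made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalise(vectors, against, tol=1e-10):
    """Gram-Schmidt: orthonormal vectors spanning `vectors`, orthogonal to `against`."""
    basis = []
    for v in vectors:
        w = list(v)
        for _ in range(2):                      # second pass limits round-off
            for u in against + basis:
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:                          # skip linearly dependent vectors
            basis.append([wi / norm for wi in w])
    return basis

def null_basis(A):
    """A fixed, reproducible orthonormal basis of null(A) (assumed construction)."""
    n = len(A[0])
    rows = orthonormalise(A, [])
    standard = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return orthonormalise(standard, rows)

def maxent(x, mean):
    """Shannon maximum-entropy p_n ~ exp(-lam * x_n) with the given mean (bisection)."""
    def dist(lam):
        w = [math.exp(-lam * xn) for xn in x]
        z = sum(w)
        return [wn / z for wn in w]
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dot(x, dist(mid)) > mean:            # the mean decreases as lam grows
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

def encode(p, q, eta):
    """Embed the code: p_tilde = p + sum_n q_n * eta_n."""
    p_c = [sum(qn * e[k] for qn, e in zip(q, eta)) for k in range(len(p))]
    return [pk + ck for pk, ck in zip(p, p_c)]

def decode(p_tilde, A, x, eta):
    data = [dot(row, p_tilde) for row in A]     # constraints recovered from p_tilde
    p = maxent(x, data[0])                      # re-derive p from the data alone
    p_c = [a - b for a, b in zip(p_tilde, p)]   # null-space component carrying the code
    return [dot(e, p_c) for e in eta]           # q_n = <eta_n | p_c>

N = 8
x = [-1.0 + 2.0 * n / (N - 1) for n in range(N)]
A = [x, [1.0] * N]                              # one moment plus normalisation
eta = null_basis(A)                             # N - 2 basis vectors
p = maxent(x, 0.15)
q = [0.43, 0.82, 0.66, 0.64, 0.84, 0.17]        # code of N - (M + 1) = 6 numbers
q_rec = decode(encode(p, q, eta), A, x, eta)
print(max(abs(a - b) for a, b in zip(q, q_rec)))
```

Both parties must call `null_basis` in exactly the same way: the code is read back as coordinates with respect to that particular basis, which is the point made in the paragraph that follows.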

Note that the success of the above encoding/decoding scheme relies on the possibility
of ordering the eigenvectors in the null space. This is perfectly possible by fixing
the numerical method for computing the eigenvectors of $\hat{G}$. However, the
process is extremely unstable, as a tiny perturbation to any of the matrix elements
of $\hat{G}$ produces a huge effect on the eigenvectors of zero eigenvalue. This
"chaotic" (in the popular sense) behaviour of the eigenvectors in the null space
naturally provides the security key for retrieving the encrypted code. Indeed,
suppose that $\epsilon$ is a very small number (of order $10^{-13}$, say) that we add
to one of the matrix elements of $\hat{G}$. Such a tiny perturbation does not yield
any detectable effect on the reconstruction of the distribution $|p\rangle$ but has
an enormous effect on the eigenvectors in the null space. Hence, as illustrated by
the examples of the next section, the perturbation $\epsilon$ provides the security
key of our encoding/decoding scheme.
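
A possible way to see why the null-space eigenvectors are so sensitive is that the zero eigenvalue is degenerate: any orthonormal basis of the null space is an equally valid set of eigenvectors, so which basis a numerical routine returns can depend on round-off-level details of its input. The Python sketch below does not perturb $\hat{G}$ itself. It assumes, purely for illustration, that the effect of a perturbation shows up as a rotation of two basis vectors (the function `rotate_pair`, the angle and all numbers are hypothetical) and shows that decoding with the rotated basis returns numbers unrelated to the code.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

s = 1.0 / math.sqrt(2.0)
# Orthonormal basis of null(A) for A = (1, 1, 1, 1), i.e. normalisation only.
eta = [[s, -s, 0.0, 0.0], [0.0, 0.0, s, -s], [0.5, 0.5, -0.5, -0.5]]

def rotate_pair(basis, theta):
    """Another orthonormal basis of the same space: rotate the first two vectors."""
    c, t = math.cos(theta), math.sin(theta)
    e1 = [c * a + t * b for a, b in zip(basis[0], basis[1])]
    e2 = [-t * a + c * b for a, b in zip(basis[0], basis[1])]
    return [e1, e2] + basis[2:]

q = [0.43, 0.82, 0.66]                                   # illustrative code
p_c = [sum(qn * e[k] for qn, e in zip(q, eta)) for k in range(4)]

q_right = [dot(e, p_c) for e in eta]                     # decoded with the encoding basis
q_wrong = [dot(e, p_c) for e in rotate_pair(eta, 1.0)]   # decoded with a rotated basis

print(q_right)
print(q_wrong)
```

The vector `p_c` lies in the same null space in both cases, which is consistent with the reconstruction of the distribution being unaffected; only the coordinates read off from it change.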

2.1 Numerical Examples 

Consider a probability distribution on an event space of dimension $N = 401$ that is
appropriately determined, by the Jaynes maximum entropy formalism, from the
normalisation-to-unity constraint and the first four moments of a random variable
$x_n$, $n = 1, \ldots, 401$, that takes values ranging from $x_1 = -1$ to
$x_{401} = 1$ with uniform increment $\Delta = 2/400$. Thus the distribution, which
arises by maximisation of Shannon's entropy $S = -\sum_{n=1}^{401} p_n \ln p_n$
subject to the given constraints, has the form:

$$p_n = \frac{e^{-\lambda_1 x_n - \lambda_2 x_n^2 - \lambda_3 x_n^3 - \lambda_4 x_n^4}}{\sum_{m=1}^{401} e^{-\lambda_1 x_m - \lambda_2 x_m^2 - \lambda_3 x_m^3 - \lambda_4 x_m^4}}. \tag{10}$$

The parameters $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are determined
from the equations

$$x_i^o = \frac{\sum_{n=1}^{401} x_n^i \, e^{-\lambda_1 x_n - \lambda_2 x_n^2 - \lambda_3 x_n^3 - \lambda_4 x_n^4}}{\sum_{n=1}^{401} e^{-\lambda_1 x_n - \lambda_2 x_n^2 - \lambda_3 x_n^3 - \lambda_4 x_n^4}}, \quad i = 1, \ldots, 4. \tag{11}$$

For $x_1^o = -0.0224$, $x_2^o = 0.1048$, $x_3^o = -0.0124$, $x_4^o = 0.0284$ one
obtains $\lambda_1 = -0.3$, $\lambda_2 = 3$, $\lambda_3 = 2$, $\lambda_4 = 3.8$. The
operator $\hat{A}$ has a $5 \times 401$ matrix representation of elements
$x_{i,n} = x_n^i$, $i = 1, \ldots, 4$; $n = 1, \ldots, 401$ and $x_{5,n} = 1$,
$n = 1, \ldots, 401$. We construct the operator $\hat{G} = \hat{A}^\dagger\hat{A}$ and
compute its eigenvectors. The 396 eigenvectors corresponding to zero eigenvalues are
used to construct the encoding operator $\hat{U}^\dagger$ to encrypt a code
$|q\rangle$ of 396 numbers. These numbers, each of which consists of 15 digits, are
taken randomly from the interval $[0, 1]$. We now proceed as indicated in the
encoding process of the previous section: we construct the vector
$|p_c\rangle = \hat{U}^\dagger|q\rangle$ and add it to the distribution $|p\rangle$ to
obtain the vector $|\tilde{p}\rangle = |p\rangle + |p_c\rangle$. This vector contains
both the information on the physical system and the code. In order to retrieve such
information we use $|\tilde{p}\rangle$ to generate the constraints as
$x_i^o = \langle i|\hat{A}|\tilde{p}\rangle$, $i = 1, \ldots, 5$. Since, by
construction, $\hat{A}|p_c\rangle = 0$, the constraints are generated from
$|\tilde{p}\rangle$ with high precision. We then use these values to solve for the
parameters of the distribution so as to recover $|p\rangle$. Vector $|p\rangle$
allows us to obtain the vector $|p_c\rangle$ from the available vector
$|\tilde{p}\rangle$ as $|p_c\rangle = |\tilde{p}\rangle - |p\rangle$. The code is thus
retrieved by the operation $|q\rangle = \hat{U}|p_c\rangle$. Table 1 gives five of the
396 code numbers. The second column corresponds to the reconstructed numbers. As can
be observed, the quality of the reconstruction is excellent. In order to give a
measure assessing the reconstruction of all numbers, let us denote by $|q^r\rangle$
the reconstructed code and define the error of the reconstruction as
$\delta_r = \| |q\rangle - |q^r\rangle \|$. The value of $\delta_r$ is in this case
$4.8 \times 10^{-14}$.
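
The maximum entropy distribution of this example can be evaluated directly from the quoted multipliers. The following Python sketch assumes the grid $x_n = -1 + 2(n-1)/400$ and the exponential form with a negative sign in front of every multiplier; the function name `maxent_distribution` is a hypothetical choice, and the script only prints the resulting moments so that they can be compared with the constraint values quoted in the text.

```python
import math

N = 401
x = [-1.0 + 2.0 * n / (N - 1) for n in range(N)]   # uniform grid from -1 to 1
lam = [-0.3, 3.0, 2.0, 3.8]                        # multipliers quoted in the text

def maxent_distribution(x, lam):
    """p_n proportional to exp(-sum_i lam_i * x_n**i), normalised to unity."""
    w = [math.exp(-sum(l * xn ** (i + 1) for i, l in enumerate(lam))) for xn in x]
    z = sum(w)
    return [wn / z for wn in w]

p = maxent_distribution(x, lam)

# First four moments of x under p, i.e. the first four components of A|p>.
moments = [sum(pn * xn ** i for pn, xn in zip(p, x)) for i in range(1, 5)]
print(moments)
```

Inverting this map, from the moments back to the multipliers, is the step that the decoding process performs when it recovers the distribution from the data.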

Let us now distort the matrix representation of operator $\hat{G}$ by adding a number
$\epsilon = 2.9 \times 10^{-13}$ to one of its elements, say the element in the first
row and fifth column. If we repeat the process using the distorted matrix, the
outcomes are the following. The perturbation has no detectable effect on the
reconstruction of the distribution $|p\rangle$. However, if we try to reconstruct the
code without taking the perturbation into account, what we obtain bears no relation
whatsoever to the true code (see the third column of Table 1). The error of the
reconstruction is $\delta_r = 17.25$. Since the recovery of the code is only possible
if the value of the perturbation is known, the key for recovering the code is the
value of the perturbation together with the two numbers labelling the element that
has been distorted, in this case (1,5). Of course, rather than distorting one matrix
element we may wish to distort a random number of them. In such a case the decoding
key becomes a string of ordered pairs of natural numbers, indicating the elements
that were randomly selected to be distorted, and the corresponding values of the
perturbations. Moreover, to avoid known-plaintext attacks [4, 5], in which the
attacker is supposed to have collected correctly decrypted messages in order to use
them to decrypt others, one can proceed as follows: while one perturbation is kept
secret as the key for decryption, other perturbations are made public and are
different for every message. This avoids the repetition of the encoding operator
with the same key. Thus, the knowledge of decrypted messages does not provide
information on the encoding operator used to encrypt other messages with the same
key, which prevents known-plaintext attacks.

Code numbers        Reconstruction      Disregarding perturbation

0.43596704982551    0.43596704982551    -0.04525907549619
0.82124392828478    0.82124392828478     0.18665860768833
0.66471591347310    0.66471591347310     0.19297513576254
0.64554449242809    0.64554449242809     0.08172338626606
0.84398334243978    0.84398334243978    -0.14994519073447

Table 1: Five code numbers and their reconstruction, assuming the perturbation to
be known (second column) and otherwise (third column).

We would like to stress that the success of the proposed procedure strongly depends
on the use of an identical numerical method to obtain the same basis of
null($\hat{G}$) in the encoding and decoding processes. Nevertheless, the procedure
does not depend on the machine processor. In the examples presented here the
encoding was performed on a powerful computer cluster, and the decoding on a laptop,
using Matlab 6.5. Let us also remark that only the precision in the representation of
operator $\hat{G}$ is crucial for the code reconstruction, since the numerical errors
in determining $|p_c\rangle$ are not magnified. This is due to the fact that, since
$\hat{U}\hat{U}^\dagger$ is the identity operator in $\mathbb{R}^{N-(M+1)}$, the
recovery of $|q\rangle$ as $|q\rangle = \hat{U}|p_c\rangle$ is very stable against
perturbations of $|p_c\rangle$.
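
The stability claim can be made explicit with a short bound, sketched here under the assumption that the computed vector differs from the exact $|p_c\rangle$ by some error $|\delta\rangle$ (a symbol introduced only for this illustration) and that the vectors $|\eta_n\rangle$ are orthonormal:

```latex
\hat{U}\bigl(|p_c\rangle + |\delta\rangle\bigr) - |q\rangle = \hat{U}|\delta\rangle,
\qquad
\bigl\|\hat{U}|\delta\rangle\bigr\|^2
  = \sum_{n=1}^{N-M-1} \bigl|\langle \eta_n|\delta\rangle\bigr|^2
  \le \bigl\||\delta\rangle\bigr\|^2 .
```

The last step is Bessel's inequality, so an error in $|p_c\rangle$ is never amplified in the recovered code.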

3 Conclusions 

An encoding/decoding scheme for embedding hidden information into the statistical
distribution of a physical system has been presented. The encryption security is
based on the extreme instability of the encoding process, which has the following
feature: a tiny perturbation to the matrix yielding the eigenvectors used to
construct the encoding operator produces a huge effect on the recovery process.

Thus, the key for retrieving the code is given by the value of the perturbation.
The secure exchange of the key is, of course, essential, but we can rely on the
secure quantum key distribution protocol to ensure a safe key delivery. Conversely,
the quantum protocol can make use of the proposed setting for encryption, as it
entails the transmission of very little information through the quantum channel.

A remarkable property of using a statistical distribution for the purpose of storing
encrypted information is that the larger the dimension of the distribution, the
larger the amount of encrypted information that can be stored. This opens the
possibility of devising more sophisticated encryption algorithms than the one
advanced here, yet based on the same principles.

We believe that the results presented here are encouraging and feel confident that
they will contribute to fruitful discussions and follow-up work on the subject.
Finally, we would like to stress that the proposed scheme is not restricted to
physical distributions. When a physical distribution is involved, one has also
stored the information on the physical system; hence the importance of using an
appropriate entropic measure for the encoding/decoding process. If the entropic
measure is the right one, the recovered statistical distribution can be used to
make correct predictions of the expected values of physical quantities that are not
experimentally available.